Unravelling the dynamics of human development and economic growth on crude oil production based on ARDL and NARDL models

This paper estimates and establishes the causality between the Human Development Index (HDI), Gross Domestic Product (GDP), inflation and CO2 emissions on crude oil production (COP) in Cameroon from 1977 to 2019. To do so, the Augmented Dickey-Fuller and Zivot-Andrews stationarity tests, ARDL and NARDL modelling, and the Toda-Yamamoto causality test are performed. Unlike previous studies on COP, this study incorporates the asymmetric impact (NARDL). The results indicate that CO2 emissions and GDP have a negative impact on COP in the long run, while HDI and inflation have a positive impact in the short run. GDP and HDI have a non-linear impact in the short run, while in the long run inflation and CO2 emissions have a non-linear impact on COP. These results suggest that, in order to allow future generations to benefit from the oil windfall, the diversification of the Cameroonian economy, the control of inflation and the use of less polluting crude oil extraction technologies are imperative.
• A step-by-step procedure of the ARDL, NARDL and causality test is provided.
• The multiplier effects of GDP, HDI, inflation and CO2 emissions on COP are simulated.
• The impact of GDP and HDI on COP is non-linear.


Introduction
The exploitation of natural resources and their use to achieve a high level of development remains a major challenge for countries [2]. Crude oil is a non-renewable and limited natural resource, yet it is incorporated in almost all sectors of activity [3]. Despite this widespread use, it has adverse effects on the environment, such as air pollution and greenhouse gas emissions [4]. Thus, reconciling crude oil development with economic development and environmental protection is not an easy task for policy makers. In this respect, we assess the linear and non-linear effects of economic growth, CO2 emissions, inflation and the human development index (HDI) on crude oil production (COP), as well as the causal links between them, using ARDL, NARDL and the Toda-Yamamoto causality test. This work parallels that of Ahmad and Du [4], Danish et al. [5] and Bildirici [6], which were limited to linear ARDL modelling. This paper contributes to the literature in several ways. First, it is the first to jointly estimate the impact of CO2 emissions, HDI, inflation and economic growth on COP in Cameroon. Secondly, in addition to the ARDL model used in [4][5][6], which separately estimates the short- and long-run linear effects of the exogenous variables on the endogenous variable, this paper captures the non-linear effects of CO2 emissions, HDI, inflation and economic growth on COP through the NARDL model which, to our knowledge, appears to have been unexplored in Central Africa and in Cameroon in particular. Applying ARDL, NARDL and the Toda-Yamamoto causality test allows for an efficient interpretation of the economic results by taking into consideration the various positive or negative shocks that time series can experience. Specifically, the paper aims to provide conclusive answers on the link between COP, economic growth, CO2 emissions and inflation in Cameroon.

Details of the method
Before starting, it is useful to recall the basic notions of autoregressive distributed lag (ARDL) modelling and to highlight its qualities. An ARDL model is an autoregressive distributed lag model capable of taking into account the temporal dimension in the explanation of a time series and of producing forecasts. The NARDL model is a non-linear autoregressive distributed lag model that captures the different breaks or shocks that can affect a series.

Dataset and data sources
The data used in this paper are annual and cover the period from 1977 to 2019. The year 1977 was chosen as the starting point because it marks the first year of crude oil production in Cameroon. The year 2019 was chosen as the end of the study period because data for some of the variables used are unavailable beyond that date. Table 1 shows the data sources for the different variables used.

Empirical model
The theoretical conclusions demonstrating the functional relationship between COP and the variables considered in this study are based on the study of Reynolds and Kolodziej [10] on the relationship between COP and economic growth in the former Soviet Union. Similarly, related studies have examined the relationship between COP and other explanatory variables such as foreign investment, geopolitical risks and CO2 emissions [3, 5, 11-13]. Olanipekun and Alola [11] in particular have examined and highlighted the impact of environmental damage costs, rent and geopolitical risk on COP in the Persian Gulf. Thus, by taking into account CO2 emissions, environmental quality, the human development index (HDI) and inflation (INF), this study is in line with the works of Reynolds and Kolodziej [10] and of Olanipekun and Alola [11]. Eq. (1) illustrates the dynamics of the links between economic growth, environmental quality and HDI on COP:

$$COP_t = f(GDP_t, HDI_t, CO2_t, INF_t) \tag{1}$$

Introducing the natural logarithm into Eq. (1) makes it possible to smooth the variables and to interpret the coefficients as elasticities. The following empirical model is therefore used:

$$LCOP_t = \beta_0 + \beta_1 LGDP_t + \beta_2 LHDI_t + \beta_3 LCO2_t + \beta_4 LINF_t + \varepsilon_t \tag{2}$$

In Eq. (2), $LCOP_t$, $LGDP_t$, $LHDI_t$, $LCO2_t$ and $LINF_t$ are the logarithmic forms of COP, GDP, HDI, CO2 emissions and inflation respectively, while $\varepsilon_t$ is the error term and $\beta_0$ is a constant.

Stationarity test
ARDL and NARDL modelling requires that all series be stationary in level (i.e. integrated of order 0, denoted I(0)) or in first difference (denoted I(1)). To avoid spurious results, none of the series should be integrated of order two (i.e. I(2)). In addition, the independent time series must be integrated of order one, I(1) [14]. Several tests can be used to check the stationarity of the variables. This paper uses the Augmented Dickey-Fuller (ADF) test, which is efficient in the presence of autocorrelated errors but whose conclusion on the unit-root null hypothesis is unreliable for series with structural breaks [15]. To overcome this shortcoming, the Zivot-Andrews (ZA) stationarity test is also used. The latter takes structural breaks into account when performing the unit root test [16]. The ADF and ZA tests are performed under the following hypotheses: $H_0$: the series has a unit root, versus $H_1$: the series does not have a unit root.
The null hypothesis $H_0$ is rejected if the p-value is less than 5 % (p-value < 0.05). To control for possible spurious regression, Table 2 presents the results of the ADF and ZA stationarity tests. LCOP, LGDP, LHDI, LCO2 and LINF represent the logarithm of COP, gross domestic product (GDP), HDI, CO2 emissions and inflation respectively.
From Table 2, it can be seen that all variables are stationary in level or in first difference and that no series is integrated of order two, which meets the requirements of the ARDL and NARDL models.
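The screening rule described above (reject the unit-root null when the p-value is below 5 %, and admit only I(0) and I(1) series) can be sketched as follows. This is a minimal illustration, not the authors' code: the function names, the series labels and the p-values are hypothetical and are not taken from Table 2.

```python
def integration_order(p_level, p_first_diff, alpha=0.05):
    """Classify a series from unit-root test p-values (H0: unit root)."""
    if p_level < alpha:
        return 0  # H0 rejected in level: I(0)
    if p_first_diff < alpha:
        return 1  # H0 rejected only after first differencing: I(1)
    return 2      # still non-stationary: I(2) or higher


def ardl_admissible(orders):
    """Bounds testing requires every series to be I(0) or I(1)."""
    return all(order <= 1 for order in orders.values())


# Hypothetical p-values (level, first difference), for illustration only.
p_values = {
    "series_a": (0.31, 0.001),
    "series_b": (0.02, 0.000),
    "series_c": (0.64, 0.210),
}
orders = {name: integration_order(*p) for name, p in p_values.items()}
print(orders)
print(ardl_admissible(orders))
```

In this made-up example the third series would be flagged as I(2) or higher, so the set as a whole would not be admissible for ARDL or NARDL estimation.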

Estimation of the ARDL model
The study of the relationship between COP in Cameroon and CO2 emissions, inflation, HDI and GDP is based on the ARDL bounds test, a procedure developed by Pesaran et al. [17]. It has a number of advantageous features: (i) the ARDL approach provides the best linear unbiased estimates of the long-run relationship; (ii) sample size does not affect the effectiveness of the ARDL approach; (iii) the long-run and short-run relationships are evaluated simultaneously; (iv) the ARDL approach can be used whether the series are I(0), I(1) or a mix of both [18]; (v) the ARDL approach addresses the problems of endogeneity and autocorrelation of the series; (vi) the lag orders need not be the same for all series; and (vii) its single-equation configuration makes it easy to use and its interpretation straightforward.

ARDL bounds test
The ARDL bounds test, which captures long-run and short-run effects, estimates the following unrestricted error correction model by ordinary least squares:

$$\Delta LCOP_t = \alpha_0 + \sum_{i=1}^{p} \gamma_{1i}\,\Delta LCOP_{t-i} + \sum_{i=0}^{p} \gamma_{2i}\,\Delta LGDP_{t-i} + \sum_{i=0}^{p} \gamma_{3i}\,\Delta LHDI_{t-i} + \sum_{i=0}^{p} \gamma_{4i}\,\Delta LCO2_{t-i} + \sum_{i=0}^{p} \gamma_{5i}\,\Delta LINF_{t-i} + \lambda_1 LCOP_{t-1} + \lambda_2 LGDP_{t-1} + \lambda_3 LHDI_{t-1} + \lambda_4 LCO2_{t-1} + \lambda_5 LINF_{t-1} + \varepsilon_t \tag{3}$$

where $\alpha_0$ is the drift component and $\varepsilon_t$ is the error term, assumed to be independently and normally distributed with zero mean and constant variance. The terms $\gamma_{ki}$ and $\lambda_k$ denote the short-run and long-run multipliers respectively. $\Delta$ denotes the first difference operator, $t$ represents the time period and $p$ is the maximum number of lags in the model, which is determined by several criteria. The Akaike Information Criterion (AIC) is the basic criterion used in this model to determine the optimal lag. The optimal lag and model chosen are those that minimise the value of the AIC. The bounds test for cointegration is applied in two steps. The first step states the null hypothesis of no cointegration, i.e. no long-run relationship between the series, against the alternative hypothesis of cointegration:

$$H_0: \lambda_1 = \lambda_2 = \lambda_3 = \lambda_4 = \lambda_5 = 0 \quad \text{versus} \quad H_1: \text{at least one } \lambda_k \neq 0$$
The cointegration of Eq. (3) is demonstrated if at least one of the long-run multipliers is different from zero. The existence of cointegration between COP and CO2 emissions, inflation, HDI and GDP is therefore established by rejecting the null hypothesis H0. The second step is to compare the F-statistic with the two critical bounds constructed by Pesaran et al. [17] in order to determine whether there is a level relationship. The lower critical bound is the first set of critical values, which assumes that the variables are integrated of order zero, I(0); the upper critical bound is the second set of critical values, which assumes that the variables are integrated of order one, I(1). H0 is rejected in favour of H1 if the F-statistic is above the upper critical bound, and it is not rejected if the F-statistic is below the lower critical bound. If the F-statistic lies between the bounds, the test is inconclusive. Thus, the calculated F-statistic is compared with the critical bounds as follows:
• If F > upper bound: there is cointegration;
• If F < lower bound: there is no cointegration;
• If lower bound < F < upper bound: no conclusion.
Using the AIC, Fig. 1 shows that the optimal lag of the series is of order 3. The ARDL (2, 2, 1, 2, 2) model outperforms the nineteen other candidate models, as it is the one that minimises the AIC. As for cointegration, Table 3 reports an F-statistic of 7.283, which is higher than the critical bounds. We therefore conclude that there is a cointegration relationship between COP and GDP, CO2 emissions, HDI and inflation.
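The three-way decision rule of the bounds test can be written as a small helper. This is only a sketch: the function name is hypothetical, and the lower and upper bounds used below are placeholder values chosen for illustration, not the critical values reported in Table 3. Only the F-statistic of 7.283 comes from the text.

```python
def bounds_test_decision(f_stat, lower_bound, upper_bound):
    """Apply the bounds-test decision rule to an F-statistic."""
    if lower_bound > upper_bound:
        raise ValueError("lower_bound must not exceed upper_bound")
    if f_stat > upper_bound:
        return "cointegration"
    if f_stat < lower_bound:
        return "no cointegration"
    return "inconclusive"


# F-statistic reported in the text; the bounds are illustrative placeholders.
F_STAT = 7.283
LOWER, UPPER = 2.86, 4.01  # assumed I(0) and I(1) bounds, not from Table 3

print(bounds_test_decision(F_STAT, LOWER, UPPER))
```

With any upper bound below 7.283 the helper returns the same conclusion as the text, namely that a long-run relationship exists.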
Having established the existence of a long-run relationship, the next step is to estimate the error correction model (ECM), which can be expressed as follows:

$$\Delta LCOP_t = \alpha_0 + \sum_{i=1}^{p} \gamma_{1i}\,\Delta LCOP_{t-i} + \sum_{i=0}^{p} \gamma_{2i}\,\Delta LGDP_{t-i} + \sum_{i=0}^{p} \gamma_{3i}\,\Delta LHDI_{t-i} + \sum_{i=0}^{p} \gamma_{4i}\,\Delta LCO2_{t-i} + \sum_{i=0}^{p} \gamma_{5i}\,\Delta LINF_{t-i} + \delta\,ECT_{t-1} + \varepsilon_t \tag{4}$$

where $\delta$ is the coefficient of the error correction term $ECT_{t-1}$, which measures the speed of adjustment from a short-run disequilibrium of the estimated ARDL model back to its long-run equilibrium. The error correction coefficient must be negative and lie between 0 and 1 in absolute value. Tables 4 and 5 show the short- and long-term results of the ARDL (2,2,1,2,2) model respectively. Table 4 shows a highly significant adjustment coefficient of less than 1 in absolute value. With the bounds test and the short- and long-term relationships in place, some assumptions must be checked to ensure the viability of the results, hence the next step.
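To see why the error correction coefficient must lie between -1 and 0, the following sketch computes how much of an initial disequilibrium remains after a number of years, and the corresponding half-life. It assumes that only the error correction term drives the adjustment (short-run dynamics are ignored); the function names and the coefficient value of -0.5 are hypothetical and are not taken from Table 4.

```python
import math


def remaining_gap(ect_coef, periods):
    """Share of an initial disequilibrium left after `periods` years."""
    if not -1.0 < ect_coef < 0.0:
        raise ValueError("coefficient must lie strictly between -1 and 0")
    return (1.0 + ect_coef) ** periods


def half_life(ect_coef):
    """Years needed to close half of the initial disequilibrium."""
    if not -1.0 < ect_coef < 0.0:
        raise ValueError("coefficient must lie strictly between -1 and 0")
    return math.log(0.5) / math.log(1.0 + ect_coef)


# Hypothetical coefficient: half of the gap is corrected each year.
print(remaining_gap(-0.5, 2))
print(half_life(-0.5))
```

A coefficient outside the (-1, 0) interval would imply either no correction, explosive divergence or oscillation around the equilibrium, which is why the helper rejects such values.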

Robustness test of the ARDL model
The assumptions of independence, normality, zero mean and constant variance made on the random disturbance terms of Eqs. (3) and (4) need to be checked.
Normality is tested with the Jarque-Bera test under the following hypotheses: $H_0$: the errors are normally distributed (p-value > 5 %) against $H_1$: the errors are not normally distributed (p-value < 5 %).
Finally, the Breusch-Pagan-Godfrey test is used to test for constant variance of the errors. It is performed under the hypotheses: $H_0$: there is no heteroscedasticity (p-value > 5 %) against $H_1$: there is heteroscedasticity (p-value < 5 %).
When using cointegration methods, it is essential to ensure that the model is correctly specified, as any misspecification leads to instability. In order to check for parameter instability and misspecification, this study uses the cumulative sum (CUSUM) of recursive residuals and the CUSUM of squares (CUSUMSQ), as recommended by Pesaran and Pesaran [19].
The ARDL (2,2,1,2,2) model is statistically robust. The empirical tests presented in Table 6 confirm the absence of heteroscedasticity, the normality of the residuals and the correct functional form of the specification. As for stability, Fig. 2 shows that the model is stable as a whole: the CUSUM and CUSUMSQ statistics do not cross the bounds defined at the 5 % threshold.
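As an illustration of the normality check, the Jarque-Bera statistic can be computed from the skewness and kurtosis of the residuals, and its asymptotic p-value follows from the Chi-square distribution with two degrees of freedom, whose survival function is exp(-x/2). The sketch below uses only the standard library; the function name and the residual values are made up for the example and are not the residuals of the estimated model. With about forty annual observations the asymptotic p-value is only an approximation.

```python
import math


def jarque_bera(residuals):
    """Return the Jarque-Bera statistic and its asymptotic p-value."""
    n = len(residuals)
    mean = sum(residuals) / n
    m2 = sum((e - mean) ** 2 for e in residuals) / n
    m3 = sum((e - mean) ** 3 for e in residuals) / n
    m4 = sum((e - mean) ** 4 for e in residuals) / n
    skewness = m3 / m2 ** 1.5
    kurtosis = m4 / m2 ** 2
    jb = n / 6.0 * (skewness ** 2 + (kurtosis - 3.0) ** 2 / 4.0)
    p_value = math.exp(-jb / 2.0)  # chi-square(2) survival function
    return jb, p_value


# Made-up symmetric residuals, for illustration only.
jb_stat, jb_p = jarque_bera([-2.0, -1.0, 0.0, 1.0, 2.0])
print(jb_stat, jb_p)
print("normality not rejected" if jb_p > 0.05 else "normality rejected")
```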

Interpretation of ARDL results
Once the model has been validated, the short- and long-term results need to be interpreted. In the short term, an increase of 1 % in GDP leads to an increase in COP of 0.59 %. The time dimension is not negligible: a 1 % increase in GDP in the previous year leads to a 0.63 % increase in current COP. CO2 emissions, inflation and HDI are not statistically significant at the 5 % threshold in the short term. These results imply that short-term economic policies favouring wealth creation encourage Cameroon to increase its COP. On the other hand, policies related to environmental protection and human development have no impact on COP in the short term.
In the long term, a 1 % increase in CO2 emissions and in GDP leads to a decrease in COP of 0.53 % and 3.27 % respectively. A 1 % increase in HDI and inflation leads to an increase in COP of 5.91 % and 1.52 % respectively. These results imply that long-term economic policy initiatives to promote wealth creation will depend less and less on COP in Cameroon. As for CO2 emissions, the results imply that environmental protection policies may weaken COP. On the other hand, the Cameroonian government's long-term policy in favour of human development will stimulate COP. What would happen to COP in the event of a sudden change (positive or negative shock) in inflation, CO2 emissions, gross domestic product or the human development index? This question motivates the NARDL modelling.

NARDL model procedure
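The ARDL model captures the short- and long-term linear effects of the series under study (the symmetric relationship). However, it fails to capture the asymmetric relationship between them (the effects of a sudden change in the variables). The NARDL model, in contrast, allows for the simultaneous estimation of short- and long-term asymmetries, which explains the choice of the non-linear ARDL model in our study. NARDL modelling does not require all variables to be integrated of the same order. Drawing on the methodology of Shin et al. [20], the NARDL model is written as follows:

$$LCOP_t = \alpha_0 + \sum_{j=1}^{p} \phi_j\, LCOP_{t-j} + \sum_{j=0}^{p} \left( \pi_{1j}^{+} LGDP_{t-j}^{+} + \pi_{1j}^{-} LGDP_{t-j}^{-} + \pi_{2j}^{+} LCO2_{t-j}^{+} + \pi_{2j}^{-} LCO2_{t-j}^{-} + \pi_{3j}^{+} LHDI_{t-j}^{+} + \pi_{3j}^{-} LHDI_{t-j}^{-} + \pi_{4j}^{+} LINF_{t-j}^{+} + \pi_{4j}^{-} LINF_{t-j}^{-} \right) + \varepsilon_t \tag{5}$$

where $LGDP_t^{+}$ and $LGDP_t^{-}$ are the positive and negative partial sums of $LGDP_t$ in Eq. (6); $LCO2_t^{+}$ and $LCO2_t^{-}$ are the positive and negative partial sums of CO2 emissions in Eq. (7); $LHDI_t^{+}$ and $LHDI_t^{-}$ are the positive and negative partial sums of the HDI in Eq. (8); and $LINF_t^{+}$ and $LINF_t^{-}$ are the positive and negative partial sums of inflation in Eq. (9):

$$LGDP_t^{+} = \sum_{j=1}^{t} \Delta LGDP_j^{+} = \sum_{j=1}^{t} \max(\Delta LGDP_j, 0), \qquad LGDP_t^{-} = \sum_{j=1}^{t} \Delta LGDP_j^{-} = \sum_{j=1}^{t} \min(\Delta LGDP_j, 0) \tag{6}$$

$$LCO2_t^{+} = \sum_{j=1}^{t} \Delta LCO2_j^{+} = \sum_{j=1}^{t} \max(\Delta LCO2_j, 0), \qquad LCO2_t^{-} = \sum_{j=1}^{t} \Delta LCO2_j^{-} = \sum_{j=1}^{t} \min(\Delta LCO2_j, 0) \tag{7}$$

$$LHDI_t^{+} = \sum_{j=1}^{t} \Delta LHDI_j^{+} = \sum_{j=1}^{t} \max(\Delta LHDI_j, 0), \qquad LHDI_t^{-} = \sum_{j=1}^{t} \Delta LHDI_j^{-} = \sum_{j=1}^{t} \min(\Delta LHDI_j, 0) \tag{8}$$

$$LINF_t^{+} = \sum_{j=1}^{t} \Delta LINF_j^{+} = \sum_{j=1}^{t} \max(\Delta LINF_j, 0), \qquad LINF_t^{-} = \sum_{j=1}^{t} \Delta LINF_j^{-} = \sum_{j=1}^{t} \min(\Delta LINF_j, 0) \tag{9}$$

Here $\Delta LGDP_t^{+}$ and $\Delta LGDP_t^{-}$ are the processes that capture increases and decreases in GDP; $\Delta LCO2_t^{+}$ and $\Delta LCO2_t^{-}$ capture increases and decreases in CO2 emissions; $\Delta LHDI_t^{+}$ and $\Delta LHDI_t^{-}$ capture increases and decreases in the HDI; and $\Delta LINF_t^{+}$ and $\Delta LINF_t^{-}$ capture increases and decreases in inflation. $p$ is defined as the optimal lag.

Fig. 3. Graphical values of AIC.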
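The decomposition of a series into positive and negative partial sums, as used in the NARDL approach of Shin et al. [20], can be sketched in a few lines. The function name and the sample values are hypothetical; the sketch starts both partial sums at zero, so that the original series equals its initial value plus the two partial sums at every date.

```python
def partial_sums(series):
    """Split a series into cumulative positive and negative changes."""
    positive, negative = [0.0], [0.0]
    for previous, current in zip(series, series[1:]):
        change = current - previous
        positive.append(positive[-1] + max(change, 0.0))
        negative.append(negative[-1] + min(change, 0.0))
    return positive, negative


# Made-up log series, for illustration only.
lgdp = [1.0, 1.5, 1.25, 2.0]
lgdp_pos, lgdp_neg = partial_sums(lgdp)
print(lgdp_pos)  # cumulative increases
print(lgdp_neg)  # cumulative decreases

# The decomposition is exact: x_t = x_0 + x_t^+ + x_t^-
rebuilt = [lgdp[0] + p + n for p, n in zip(lgdp_pos, lgdp_neg)]
print(rebuilt == lgdp)
```

Entering the two partial sums as separate regressors is what allows increases and decreases of the same variable to carry different coefficients.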

NARDL bounds test
Like the ARDL model, the NARDL model requires an examination of the existence of a long-run relationship between the series. The NARDL representation captures the asymmetric effects that GDP, inflation, HDI and CO2 emissions can have on COP. Consistent with Shin et al. [20], the model is rewritten in error-correction form as Eq. (10):

$$\Delta LCOP_t = \alpha_0' + \rho\, LCOP_{t-1} + \sum_{k=1}^{4} \left( \theta_k^{+} x_{k,t-1}^{+} + \theta_k^{-} x_{k,t-1}^{-} \right) + \sum_{i=1}^{p} \gamma_i'\, \Delta LCOP_{t-i} + \sum_{k=1}^{4} \sum_{i=0}^{p} \left( \varphi_{ki}^{+} \Delta x_{k,t-i}^{+} + \varphi_{ki}^{-} \Delta x_{k,t-i}^{-} \right) + \varepsilon_t' \tag{10}$$

where $x_1 = LGDP$, $x_2 = LCO2$, $x_3 = LHDI$ and $x_4 = LINF$, and $\sum_{i=0}^{p} \varphi_{ki}^{+}$ and $\sum_{i=0}^{p} \varphi_{ki}^{-}$ capture the positive and negative short-term effects respectively, while $\theta_k^{-}$ and $\theta_k^{+}$ capture the long-run negative and positive effects of GDP, HDI, inflation and CO2 emissions on COP respectively. $\varepsilon_t'$ is the error term, independently and normally distributed with zero mean and constant variance. $\Delta$ denotes the first difference operator, $t$ represents the time period and $p$ is the maximum number of lags of the model, which is determined through several criteria. The main criterion used in this model to determine the ideal lag is the AIC. The optimal lag and model are those that minimise the AIC value. The bounds test for cointegration is applied in two steps. The first step states the null hypothesis of no cointegration, i.e. no long-term relationship between the series, against the alternative hypothesis of cointegration.
Cointegration is tested under the following hypotheses:

$$H_0': \rho = \theta_k^{+} = \theta_k^{-} = 0 \ \text{for all } k \quad \text{versus} \quad H_1': \text{at least one of } \rho, \theta_k^{+}, \theta_k^{-} \text{ is different from zero}$$

The cointegration of Eq. (10) is demonstrated if at least one of the long-run multipliers is different from zero. The existence of cointegration between COP and the various positive and negative shocks to CO2 emissions, inflation, HDI and GDP is therefore established by rejecting the null hypothesis $H_0'$. The second step is to compare the F-statistic with the two critical bounds produced by Pesaran et al. [17] to see whether there is a level relationship. The lower critical bound is the first set of critical values, which assumes that the variables are integrated of order zero, I(0); the upper critical bound is the second set of critical values, which assumes that the variables are integrated of order one, I(1). $H_0'$ is rejected in favour of $H_1'$ if the F-statistic is above the upper critical bound, and it is not rejected if the F-statistic is below the lower critical bound. If the F-statistic lies between the bounds, the test is inconclusive. Thus, the calculated F-statistic is compared with the critical bounds as follows:
• If F > upper bound: cointegration exists;
• If F < lower bound: cointegration does not exist;
• If lower bound < F < upper bound: no conclusion.
The optimal lag of the series, according to the AIC, is of order 3, as shown in Fig. 3. The NARDL (2, 1, 0, 0, 0, 0, 1, 1, 0) model outperforms the nineteen other candidate models, as it is the one that minimises the AIC. As for cointegration, Table 7 reports an F-statistic of 4.11, which is higher than the critical bounds. We therefore conclude that there is a cointegration relationship. Once a long-term relationship has been established, the next step is to estimate the error correction model (ECM), which can be expressed as follows:

$$\Delta LCOP_t = \alpha_0' + \sum_{i=1}^{p} \gamma_i'\, \Delta LCOP_{t-i} + \sum_{k=1}^{4} \sum_{i=0}^{p} \left( \varphi_{ki}^{+} \Delta x_{k,t-i}^{+} + \varphi_{ki}^{-} \Delta x_{k,t-i}^{-} \right) + \delta'\, ECT_{t-1}' + \varepsilon_t' \tag{11}$$

where $\delta'$ is the coefficient of the error correction term $ECT_{t-1}'$, which measures the speed of adjustment from a short-run disequilibrium of the estimated NARDL (2, 1, 0, 0, 0, 0, 1, 1, 0) model back to its long-run equilibrium. The error correction coefficient must be negative and lie between 0 and 1 in absolute value. Tables 8 and 9 show the short- and long-term results of the NARDL model respectively. Table 8 shows a highly significant adjustment coefficient of less than 1 in absolute value.
Now that the short- and long-term relationships of the NARDL model have been estimated, the reliability of the model must be checked. This requires the verification of certain assumptions, hence the next step.

Robustness test of the NARDL model
The NARDL (2,1,0,0,0,0,1,1,0) model is valid only if the assumptions of independence and normal distribution made on the error terms of Eqs. (10) and (11) hold. Thus, to check the robustness of the NARDL model, we proceed in exactly the same way as for the ARDL model in section 2.4.2.
Statistically, the NARDL (2, 1, 0, 0, 0, 0, 1, 1, 0) model is valid. The empirical tests presented in Table 10 confirm the absence of heteroscedasticity, the normality of the residuals and the correct functional form of the specification. In addition, the NARDL (2, 1, 0, 0, 0, 0, 1, 1, 0) model has a good overall fit and explains 97 % of the dynamics of the different shocks of the explanatory series on COP over the period 1977 to 2019. Turning to stability, Fig. 4 shows that the NARDL (2, 1, 0, 0, 0, 0, 1, 1, 0) model is stable overall, because the CUSUM statistics do not cross the bounds defined at the 5 % threshold. The Wald test is then applied in the next step in order to confirm the asymmetric effects in the short and long term shown in Tables 8 and 9.

Wald test
With the NARDL (2,1,0,0,0,0,1,1,0) model shown to be robust and cointegration established, the standard Wald test is used to confirm the asymmetric short-run and long-run effects of GDP, CO2 emissions, HDI and inflation on COP in Cameroon. Writing $\varphi_{ki}^{+}$ and $\varphi_{ki}^{-}$ for the short-run coefficients and $-\theta_k^{+}/\rho$ and $-\theta_k^{-}/\rho$ for the long-run coefficients on the positive and negative partial sums of explanatory variable $k$ in Eq. (10), the test is based on the following hypotheses:
• Short-term hypothesis: $H_0: \sum_{i} \varphi_{ki}^{+} = \sum_{i} \varphi_{ki}^{-}$ (absence of asymmetry) against $H_1: \sum_{i} \varphi_{ki}^{+} \neq \sum_{i} \varphi_{ki}^{-}$ (presence of asymmetry);
• Long-term hypothesis: $H_0: -\theta_k^{+}/\rho = -\theta_k^{-}/\rho$ (absence of asymmetry) against $H_1: -\theta_k^{+}/\rho \neq -\theta_k^{-}/\rho$ (presence of asymmetry).
The asymmetric short- or long-term effects of GDP, HDI, CO2 emissions and inflation are confirmed by the rejection of the null hypothesis $H_0$. $H_0$ is rejected if the p-value is less than 5 %. The results of the Wald test presented in Table 11 show that GDP and HDI have asymmetric effects in the short run, while in the long run CO2 emissions and inflation have asymmetric effects.
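For a single linear restriction such as "positive coefficient equals negative coefficient", the Wald statistic is the squared difference of the two estimates divided by the variance of that difference, and it is asymptotically Chi-square with one degree of freedom. The sketch below covers only this linear case (the long-run restriction involves ratios of coefficients and would need the delta method); the function name and all numerical inputs are hypothetical and are not taken from Table 11.

```python
import math


def wald_symmetry_test(b_pos, b_neg, var_pos, var_neg, cov=0.0, alpha=0.05):
    """Wald test of H0: b_pos == b_neg (symmetry) for two estimates."""
    variance = var_pos + var_neg - 2.0 * cov
    if variance <= 0.0:
        raise ValueError("variance of the difference must be positive")
    statistic = (b_pos - b_neg) ** 2 / variance
    # Survival function of a chi-square(1) variable at `statistic`.
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value, p_value < alpha


# Hypothetical estimates and variances, for illustration only.
w, p, asymmetric = wald_symmetry_test(0.9, 0.1, 0.04, 0.04)
print(w, p)
print("asymmetry" if asymmetric else "symmetry not rejected")
```

In this made-up case the p-value is below 5 %, so symmetry is rejected and the effect is treated as asymmetric.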

Interpretation of NARDL results
Short-term asymmetric analysis indicates a strong non-linear effect of HDI and GDP on COP. A positive shock in GDP of 1 % leads to an increase in COP of 0.77 %. A negative shock in HDI of 1 % leads to a decrease in COP of 1.64 %. This means that a fall in HDI, possibly due to education, would result in a significant fall in COP. The positive GDP shock implies that a radical change in economic policy aimed at accelerating wealth creation in Cameroon would lead to a significant increase in COP.
In the long term, asymmetric analysis indicates a strong non-linear effect of inflation and GDP on COP. Thus, a positive shock of 1 % in inflation, as shown in Table 9, leads to an increase in COP of 0.95 %. This means that a sudden rise in the prices of goods and services would lead to an increase in COP. CO2 emissions are not significant. This implies that oil production could take place independently of a positive or negative shock to CO2 emissions. The Wald test confirms the asymmetric short- and long-term effects of GDP, HDI, CO2 emissions and inflation on COP.

Dynamics of the multiplier effect
The dynamic multiplier effect evaluates the response of COP to a variation of 1 % in each explanatory variable. From Eqs. (10) and (12a-d), the long-term asymmetric coefficients are estimated as $L^{+} = -\theta^{+}/\rho$ and $L^{-} = -\theta^{-}/\rho$, which the dynamic multipliers approach as the horizon tends to infinity. The illustrations in Fig. 5 show that the positive GDP shock clearly dominates the negative shock in the overall asymmetry. On the other hand, the negative shock from inflation, HDI and CO2 emissions clearly dominates the positive shock in the overall asymmetry.
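The convergence of the cumulative multipliers towards the long-run coefficient can be illustrated on a deliberately simplified model with one regressor and one lag, y_t = phi * y_{t-1} + theta0 * x_t + theta1 * x_{t-1}, where x stands for one partial sum. This is not the estimated NARDL specification: the function name and the coefficient values are hypothetical. In this toy model the long-run coefficient is (theta0 + theta1) / (1 - phi), which matches the -theta/rho form with rho = phi - 1.

```python
def cumulative_multipliers(phi, theta0, theta1, horizon):
    """Response of y to a permanent unit rise in x, for h = 0..horizon."""
    if not abs(phi) < 1.0:
        raise ValueError("phi must be below 1 in absolute value")
    path = [theta0]  # impact effect at h = 0
    for _ in range(horizon):
        # From h = 1 onwards both x_t and x_{t-1} carry the unit rise.
        path.append(phi * path[-1] + theta0 + theta1)
    return path


# Hypothetical coefficients, for illustration only.
PHI, THETA0, THETA1 = 0.5, 0.2, 0.3
path = cumulative_multipliers(PHI, THETA0, THETA1, horizon=40)
long_run = (THETA0 + THETA1) / (1.0 - PHI)

print(path[:4])
print(path[-1], long_run)
```

Running the same recursion separately with the coefficients on the positive and on the negative partial sum gives the two curves whose gap is plotted as the asymmetry in figures of this kind.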

Toda-Yamamoto causality
Since the series are integrated of different orders, the traditional Granger causality test is not applicable, which motivates the use of the Toda and Yamamoto causality test [21]. In concrete terms, this involves estimating an augmented vector autoregression (VAR) in levels, which serves as the basis for the causality test, under the hypothesis of a probable cointegration between the series. The null hypothesis specifies the absence of a causal relationship between the series. The causality test procedure of [21] is as follows:
• Find the maximum order of integration ($d_{max}$) of the series under study using conventional stationarity tests;
• Determine the optimal lag ($k$) of the VAR, or autoregressive (AR) polynomial, of the model using the information criteria (AIC);
• Estimate a level-augmented VAR of order $p = k + d_{max}$. The stationarity properties of the series define the number of lags to be added to the VAR: for series stationary in level, no lag is added (standard test procedure); for I(1) series, one lag is added, and so on;
• Check the robustness of the VAR model of order $p = k + d_{max}$ using the available diagnostic tests [22];
• Perform the Wald test on the first $k$ parameters; the statistic follows an asymptotic Chi-square distribution with $k$ degrees of freedom (for more information, see [22]).
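The bookkeeping of the Toda-Yamamoto procedure, i.e. how many lags are estimated and which of them enter the Wald restriction, can be sketched as follows. The function name and the dictionary keys are hypothetical; the inputs k = 3 and d_max = 1 are the values stated in the text for this study, and the degrees of freedom assume a bivariate model in which one variable is excluded from the other's equation.

```python
def toda_yamamoto_plan(k, d_max):
    """Lag layout of the augmented VAR used for the causality test."""
    if k < 1 or d_max < 0:
        raise ValueError("k must be >= 1 and d_max must be >= 0")
    var_order = k + d_max
    return {
        "var_order": var_order,                           # lags estimated
        "tested_lags": list(range(1, k + 1)),             # in the Wald test
        "untested_lags": list(range(k + 1, var_order + 1)),  # extra lags
        "chi2_df": k,                                     # degrees of freedom
    }


# Optimal lag k = 3 and maximum integration order d_max = 1, as in the text.
plan = toda_yamamoto_plan(k=3, d_max=1)
print(plan)
```

The extra d_max lags are estimated but left unrestricted, which is what keeps the Wald statistic asymptotically Chi-square even when the series are integrated or cointegrated.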
Model 3: HDI and COP
Model 4: Inflation and COP
The interpretation of the causal inference of models 3 to 5 follows the same logic as for models 1 and 2. Before performing the Toda-Yamamoto test, the stationarity of the series should be checked. Table 2 presents the results of the stationarity tests. The series are integrated at level and first difference, so the maximum integration order is set to one, I(1). The optimal lag length according to the AIC is 3, which allows the VAR to satisfy both the stability conditions and the absence of autocorrelation. The Toda-Yamamoto causality results obtained from the VAR are presented in Table 12. Fig. 6 summarises the causal links between the different variables.

Conclusion
ARDL and NARDL models and the Toda-Yamamoto causality test were used in this study to estimate and establish the causal links between GDP, HDI, CO2 emissions and inflation on COP in Cameroon over the period 1977 to 2019. The combination of the ARDL and NARDL methods and the Toda-Yamamoto causality test provides sufficient information on the links between the variables to guide policy makers in their decisions. Inflation control should be an essential element of economic policy, as it has a positive impact on COP. Diversification of the economy should also be an essential element of the country's development. A positive shock to economic growth has a negative impact on COP. Finally, the rents from COP should be directed towards education, health and, in short, towards social projects to improve the living conditions of the population. There is no causality between and HDI. Environmental regulations could be further strengthened so that COP has less impact on the environment. This study does not include institutional quality as a series that can explain COP. However, institutional quality is very important in the decision-making process of oil resource investments. It would be interesting for future research to examine the relationship between oil production and institutional quality.

Table 1
Data description.

Table 2
Unit root test.

Table 3
Cointegration test at the bounds.

Table 6
Diagnostic test results of the estimated model.
* * significant at the 1 % and 5 % level respectively

Table 9
Long-term dynamics of NARDL.

Table 10
Results of the different diagnostic tests of the estimated model.